The Scanning Mobility Particle Sizer (SMPS) is a high-resolution particle sizer that is commonly used in research for characterizing the size distribution of aerosols.
The py-smps Python library is a simple way to read in the data, analyze it, and visualize it. A loader (smps.io.load_file) imports the data from the SMPS, and two plotting functions are available (smps.plots.heatmap, smps.plots.histplot).
Below is a quick tutorial showing how to import the data, look at it, and plot it. Any bugs with the software can be reported on GitHub.
I personally recommend using Python 3 and leaning heavily on seaborn for visualization help. There are three required packages for this library:
To make the process seamless, I recommend exporting your data from the SMPS in column format with a ',' delimiter. For units, dN/dlogDp is preferred, as it is the natural format for aerosol distributions. An ambient dataset is available here.
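To see why dN/dlogDp is a convenient unit: it divides the count in each bin by that bin's width in log10 space, so distributions recorded with different bin widths can be compared directly. The following is a minimal sketch of that normalization. The function name and the example values are hypothetical and are not part of py-smps.

```python
import math

def to_dndlogdp(dn, lower_nm, upper_nm):
    """Normalize a per-bin count (dN) by the bin's width in log10 space."""
    dlogdp = math.log10(upper_nm) - math.log10(lower_nm)
    return dn / dlogdp

# Hypothetical bin: 120 particles/cm^3 counted between 10 nm and 100 nm.
# The bin spans exactly one decade, so dlogDp is 1.
one_decade = to_dndlogdp(120.0, 10.0, 100.0)

# A narrower hypothetical bin (10 nm to 20 nm) gives a larger normalized value.
narrow = to_dndlogdp(50.0, 10.0, 20.0)
```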
The beautification of plots is aided by using seaborn. For more information, check out their documentation! It's great.
In [1]:
import smps
import seaborn as sns
import os
import matplotlib
import matplotlib.pyplot as plt
import json
%matplotlib inline
# You can use seaborn to easily control how your plots appear
sns.set('notebook', style='ticks', font_scale=1.5, palette='dark')
smps.set()
print("smps v{}".format(smps.__version__))
print("seaborn v{}".format(sns.__version__))
print("matplotlib v{}".format(matplotlib.__version__))
The SMPS loader (smps.io.load_file) returns an SMPS object which has several attributes including:
SMPS.raw
SMPS.df
SMPS.meta
SMPS.bins
SMPS.midpoints
SMPS.bin_labels
SMPS.histogram
smps.io.load_file(fpath, column=True, **kwargs)
fpath: File path for the data
column: If your data is in 'column' format, set True. Otherwise, set False
In [2]:
bos = smps.io.load_sample("boston")
In [3]:
print(json.dumps(bos.meta, indent=4))
SMPS.bins and SMPS.midpoints
SMPS.bins is an n x 3 array that contains the left, middle, and right side of each bin in the dataset. SMPS.midpoints is simply the center column of bins. NOTE: All diameters are expected to be in nm. This can be changed by altering the dp_units argument. All diameters are then promptly converted to microns.
In [4]:
# print out the first 4 bins
bos.bins[0:4]
In [5]:
# print out the midpoints
bos.midpoints
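The layout of the bins array described above can be sketched without the library. This is only an illustration: the helper name and the edge values are hypothetical, and the geometric midpoint is an assumption (the tutorial does not say how py-smps derives the middle column).

```python
import math

def make_bins_um(edges_nm):
    """Build [left, mid, right] rows in microns from bin edges given in nm."""
    rows = []
    for left, right in zip(edges_nm[:-1], edges_nm[1:]):
        mid = math.sqrt(left * right)  # assumption: geometric midpoint
        # Convert nm to microns, mirroring the unit conversion noted above
        rows.append([left / 1000.0, mid / 1000.0, right / 1000.0])
    return rows

bins = make_bins_um([10.0, 40.0, 160.0])  # hypothetical edges in nm
midpoints = [row[1] for row in bins]      # the center column of bins
```

Each row's right edge is the next row's left edge, which is what makes the array usable for drawing contiguous bars.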
In [6]:
# Display the first few rows of the DataFrame
bos.data.head(3)
SMPS.stats
SMPS.stats contains the statistics generated by the SMPS. You can weight by number, surface area, volume, or mass, and the results include the total number of particles, total surface area, total volume, total mass, the arithmetic mean (AM), the geometric mean (GM), the mode, and the geometric standard deviation (GSD).
In addition, you can integrate or calculate the stats over just a small section of the distribution by leveraging the dmin and dmax arguments.
In [7]:
bos.stats(weight='number').head()
In [8]:
bos.scan_stats.head()
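For readers who want to see what the number-weighted GM and GSD mean in practice, here is a minimal sketch using the standard textbook definitions for binned data. It is not the py-smps implementation: the function name is hypothetical, the counts are assumed to be per-bin number concentrations (dN, not dN/dlogDp), and treating dmin and dmax as inclusive limits is an assumption.

```python
import math

def number_stats(midpoints, counts, dmin=0.0, dmax=float("inf")):
    """Number-weighted total, GM, and GSD for bins with dmin <= Dp <= dmax."""
    kept = [(d, n) for d, n in zip(midpoints, counts) if dmin <= d <= dmax]
    total = sum(n for _, n in kept)
    # Geometric mean: exponential of the count-weighted mean of ln(Dp)
    ln_gm = sum(n * math.log(d) for d, n in kept) / total
    # GSD: exponential of the count-weighted standard deviation of ln(Dp)
    var = sum(n * (math.log(d) - ln_gm) ** 2 for d, n in kept) / total
    return total, math.exp(ln_gm), math.exp(math.sqrt(var))

# Hypothetical three-bin distribution (midpoints in microns)
mids = [0.01, 0.1, 1.0]
dn = [100.0, 200.0, 100.0]

total, gm, gsd = number_stats(mids, dn)
# Restrict the calculation to particles no larger than 0.5 microns
sub_total, sub_gm, sub_gsd = number_stats(mids, dn, dmax=0.5)
```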
We can go ahead and resample the data by mean if we would like to! Under the hood, this method splits the raw dataframe into numeric and non-numeric columns, then resamples the numeric columns by mean and the non-numeric columns by 'first'. If inplace=True, the resampled data replaces the current raw dataframe. Otherwise, a copy of the object is returned.
In [9]:
bos.resample("5min", inplace=True)
bos.data.head(3)
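The "mean for numeric columns, first for everything else" rule can be illustrated with a small library-free sketch. py-smps itself works on the pandas dataframe. The version below only mirrors the logic on plain dictionaries, and the function name, field names, and values are all hypothetical.

```python
from datetime import datetime, timedelta

def resample_mean_first(rows, minutes=5):
    """Bucket rows by time; average numeric fields, keep the first of the rest."""
    buckets = {}
    for row in rows:
        ts = row["time"]
        # Floor the timestamp to the start of its bucket
        # (assumes the bucket length divides 60 minutes evenly)
        start = ts - timedelta(minutes=ts.minute % minutes,
                               seconds=ts.second,
                               microseconds=ts.microsecond)
        buckets.setdefault(start, []).append(row)

    out = []
    for start in sorted(buckets):
        group = buckets[start]
        merged = {"time": start}
        for key in group[0]:
            if key == "time":
                continue
            values = [r[key] for r in group]
            if all(isinstance(v, (int, float)) for v in values):
                merged[key] = sum(values) / len(values)  # numeric: mean
            else:
                merged[key] = values[0]                  # non-numeric: first
        out.append(merged)
    return out

rows = [
    {"time": datetime(2016, 11, 23, 0, 1), "dndlogdp": 100.0, "status": "Normal Scan"},
    {"time": datetime(2016, 11, 23, 0, 3), "dndlogdp": 300.0, "status": "Normal Scan"},
    {"time": datetime(2016, 11, 23, 0, 6), "dndlogdp": 50.0, "status": "Error"},
]
resampled = resample_mean_first(rows, minutes=5)
```

The first two scans fall in the same 5-minute bucket and are averaged, while the text field simply keeps its first value.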
Okay. All we really want to do is visualize our data, right? Two common plots are a heatmap-like plot (smps.plots.heatmap) and a particle size distribution (smps.plots.histplot).
Here, we show how to use both of them. Each one returns a matplotlib axis object which can easily be manipulated as you would any other matplotlib object. This makes it easy to alter how they look, add labels, etc.
smps.plots.heatmap(X, Y, Z, ax=None, kind='log', cbar=True, cmap=default_cmap, fig_kws=None, cbar_kws=None, **kwargs)
Okay, so all you really need to do to plot the heatmap is give it your X, Y, and Z data:
X: Time axis
Y: Bin midpoints
Z: Data (usually in the format of $dN/dlogD_p$)
You may think the default colormap is not ideal (it probably isn't), so you can easily change it by feeding it any valid matplotlib colormap object. You can read more about those here or here.
In [10]:
X = bos.dndlogdp.index
Y = bos.midpoints
Z = bos.dndlogdp.T.values
ax = smps.plots.heatmap(X, Y, Z, cmap='viridis', fig_kws=dict(figsize=(14, 6)))
# make the x axis dates look presentable
import matplotlib.dates as dates
ax.xaxis.set_minor_locator(dates.HourLocator(byhour=[0, 6, 12, 18]))
ax.xaxis.set_major_formatter(dates.DateFormatter("%d\n%b\n%Y"))
# Go ahead and change things!
ax.set_title("Cambridge, MA Wintertime SMPS Data", y=1.02, fontsize=20);
smps.plots.histplot(histogram, bins, ax=None, plot_kws=None, fig_kws=None, **kwargs)
To plot a histogram, you need to provide two pieces of information:
histogram: Your histogram data! You can provide it as an array, or as a DataFrame (it will be averaged out)
bins: Bin midpoints
There are plenty of ways to customize these plots. You can provide additional keyword arguments for the matplotlib bar chart (plot_kws) or the figure itself (fig_kws). You can also plot on an existing axis by providing that argument.
In [11]:
ax = smps.plots.histplot(bos.dndlogdp, bos.bins, plot_kws={'linewidth': .01}, fig_kws=dict(figsize=(12,6)))
ax.set_title("Cambridge, MA Wintertime Size Distribution")
ax.set_ylabel(r"$dN/dlogD_p \; [cm^{-3}]$")  # raw string keeps the LaTeX backslash intact
sns.despine()
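When a DataFrame is passed as the histogram, it is averaged before plotting. Assuming that average is a plain per-bin mean over all scans (the tutorial does not spell this out), the bar heights and widths can be sketched as follows. The helper name and all values are hypothetical.

```python
def average_histogram(scans):
    """Average each bin across scans (rows are scans, columns are bins)."""
    n = len(scans)
    return [sum(col) / n for col in zip(*scans)]

# Two hypothetical scans over three bins (dN/dlogDp values)
scans = [
    [10.0, 40.0, 20.0],
    [30.0, 60.0, 40.0],
]
# Matching [left, mid, right] rows in microns
bins = [
    [0.01, 0.02, 0.04],
    [0.04, 0.08, 0.16],
    [0.16, 0.32, 0.64],
]

heights = average_histogram(scans)                  # one bar height per bin
widths = [right - left for left, _, right in bins]  # one bar width per bin
```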
In [12]:
# Named `days` so it does not shadow the matplotlib.dates module imported as `dates` above
days = ["2016-11-23", "2016-11-24", "2016-11-25"]
ax = None
for i, day in enumerate(days):
    color = sns.color_palette()[i]
    plot_kws = dict(alpha=0.65, color=color, linewidth=0.)
    # .loc selects every scan recorded on the given day
    ax = smps.plots.histplot(bos.dndlogdp.loc[day], bos.bins, ax=ax, plot_kws=plot_kws, fig_kws=dict(figsize=(12, 6)))
# Add us a legend!
ax.legend(days, loc='best')
ax.set_ylabel(r"$dN/dlogD_p \; [cm^{-3}]$")
# Remove the spines
sns.despine()